Smart Paper Notes

Predicting Parking Lot Availability by Graph-to-Sequence Model: A Case Study with SmartSantander

Yuya Sasaki, Junya Takayama, Juan Ramón Santana, Shohei Yamasaki, Tomoya Okuno, Makoto Onizuka

Category: Machine Learning

2022-06-21

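Multiple smart city initiatives are under way around the world to improve services and the livability of urban areas. SmartSantander is a smart city project in Santander, Spain, which relies on wireless sensor network technologies to deploy heterogeneous sensors within the city and measure multiple parameters, including outdoor parking information. In this paper, we study the prediction of parking lot availability using historical data from more than 300 of SmartSantander's outdoor parking sensors. We design a graph-to-sequence model to capture the periodic fluctuation and geographical proximity of parking lots. To develop and evaluate our model, we use a 3-year dataset of parking lot availability in the city of Santander. Our model achieves high accuracy compared with existing sequence-to-sequence models and is accurate enough to provide a parking information service in the city. We apply our model to a smartphone application intended for wide use by citizens and tourists.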

translated by Google Translate

Semi-supervised Fashion Compatibility Prediction by Color Distortion Prediction

Ling Xiao, Toshihiko Yamasaki

Category: Computer Vision

2022-12-27

Supervised learning methods suffer from the need for a large-scale labeled dataset, which is difficult to obtain. This is an even more significant issue for fashion compatibility prediction because compatibility aims to capture people's perception of aesthetics, which is sparse and changing. The labeled dataset may therefore become outdated quickly due to fast fashion. Moreover, labeling the dataset requires some expert knowledge; annotators should at least have a good sense of aesthetics. However, there are few self-supervised or semi-supervised learning techniques in this field. In this paper, we propose a general color distortion prediction task that forces the baseline to recognize low-level image information and thereby learn a more discriminative representation for fashion compatibility prediction. Specifically, we first propose to distort the image by adjusting its color balance, contrast, sharpness, and brightness. Then, we propose adding Gaussian noise to the distorted image before passing it to the convolutional neural network (CNN) backbone to learn a probability distribution over all possible distortions. The proposed pretext task is adopted in state-of-the-art fashion compatibility methods and proves effective in improving their ability to extract better feature representations. Applying the proposed pretext task to a baseline consistently outperforms the original baseline.
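
The distort-then-add-noise step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it works on a toy grayscale image stored as nested lists, covers only brightness and contrast (color balance and sharpness are omitted), and the factor values, noise level, label encoding, and all function names are assumptions made for the example. The CNN that would predict the label is not shown.

```python
import random


def adjust_brightness(img, factor):
    # Scale every pixel; factor 1.0 leaves the image unchanged.
    return [[p * factor for p in row] for row in img]


def adjust_contrast(img, factor):
    # Move each pixel toward (factor < 1) or away from (factor > 1) the image mean.
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [[mean + (p - mean) * factor for p in row] for row in img]


# Hypothetical distortion set and strengths; the paper's actual settings are not given here.
DISTORTIONS = [("brightness", adjust_brightness), ("contrast", adjust_contrast)]
FACTORS = [0.5, 1.5]


def make_pretext_sample(img, rng, noise_std=0.02):
    """Return (distorted noisy image, class label) for a distortion-prediction task."""
    d_idx = rng.randrange(len(DISTORTIONS))
    f_idx = rng.randrange(len(FACTORS))
    distort = DISTORTIONS[d_idx][1]
    distorted = distort(img, FACTORS[f_idx])
    # Add Gaussian noise after the distortion, then clip back to [0, 1].
    noisy = [
        [min(1.0, max(0.0, p + rng.gauss(0.0, noise_std))) for p in row]
        for row in distorted
    ]
    # One class per (distortion, factor) pair: an assumed label encoding.
    label = d_idx * len(FACTORS) + f_idx
    return noisy, label


rng = random.Random(0)
image = [[0.2, 0.4], [0.6, 0.8]]  # toy 2x2 grayscale image with values in [0, 1]
x, y = make_pretext_sample(image, rng)
print(y, x)
```

A classifier trained on such `(x, y)` pairs would output a distribution over the distortion classes, which is the sense in which the pretext task needs no human labels.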


AugNet: Dynamic Test-Time Augmentation via Differentiable Functions

Shohei Enomoto, Monikka Roslianna Busto, Takeharu Eda

Category: Computer Vision

2022-12-09

Distribution shifts, which often occur in the real world, degrade the accuracy of deep learning systems, and thus improving robustness is essential for practical applications. To improve robustness, we study an image enhancement method that generates recognition-friendly images without retraining the recognition model. We propose a novel image enhancement method, AugNet, which is based on differentiable data augmentation techniques and generates a blended image from many augmented images to improve the recognition accuracy under distribution shifts. In addition to standard data augmentations, AugNet can also incorporate deep neural network-based image transformation, which further improves the robustness. Because AugNet is composed of differentiable functions, AugNet can be directly trained with the classification loss of the recognition model. AugNet is evaluated on widely used image recognition datasets using various classification models, including Vision Transformer and MLP-Mixer. AugNet improves the robustness with almost no reduction in classification accuracy for clean images, which is a better result than the existing methods. Furthermore, we show that interpretation of distribution shifts using AugNet and retraining based on that interpretation can greatly improve robustness.


Fresnel Microfacet BRDF: Unification of Polari-Radiometric Surface-Body Reflection

Tomoki Ichikawa, Yoshiki Fukao, Shohei Nobuhara, Ko Nishino

Category: Computer Vision

2022-12-08

Computer vision applications have heavily relied on the linear combination of Lambertian diffuse and microfacet specular reflection models for representing reflected radiance, which turns out to be physically incompatible and limited in applicability. In this paper, we derive a novel analytical reflectance model, which we refer to as Fresnel Microfacet BRDF model, that is physically accurate and generalizes to various real-world surfaces. Our key idea is to model the Fresnel reflection and transmission of the surface microgeometry with a collection of oriented mirror facets, both for body and surface reflections. We carefully derive the Fresnel reflection and transmission for each microfacet as well as the light transport between them in the subsurface. This physically-grounded modeling also allows us to express the polarimetric behavior of reflected light in addition to its radiometric behavior. That is, FMBRDF unifies not only body and surface reflections but also light reflection in radiometry and polarization and represents them in a single model. Experimental results demonstrate its effectiveness in accuracy, expressive power, and image-based estimation.


Active Inference for Autonomous Decision-Making with Contextual Multi-Armed Bandits

Shohei Wakayama, Nisar Ahmed

Category: Robotics | Machine Learning

2022-09-19

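In autonomous robotic decision-making under uncertainty, the tradeoff between exploitation and exploration of the available options must be considered. If secondary information associated with the options can be utilized, such decision-making problems can often be formulated as contextual multi-armed bandits (CMABs). In this study, we apply active inference, which has been actively studied in neuroscience in recent years, as an alternative action selection strategy for CMABs. Unlike conventional action selection strategies, active inference makes it possible to rigorously evaluate the uncertainty of each option when calculating the expected free energy (EFE) associated with the decision agent's probabilistic model, as derived from the free-energy principle. We specifically address the case where a categorical observation likelihood function is used, so that EFE values are analytically intractable. We introduce new approximation methods for computing the EFE based on variational and Laplace approximations. Extensive simulation results demonstrate that, compared with other strategies, active inference generally requires far fewer iterations to identify the optimal options and generally achieves superior cumulative regret, at a relatively low extra computational cost.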


MRI-MECH: Mechanics-informed MRI to estimate esophageal health

Sourav Halder, Ethan M. Johnson, Jun Yamasaki, Peter J. Kahrilas, Michael Markl, John E. Pandolfino, Neelesh A. Patankar

Category: Machine Learning

2022-09-15

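Dynamic magnetic resonance imaging (MRI) is a popular medical imaging technique that generates image sequences of the flow of a contrast material inside tissues and organs. However, its application to imaging bolus movement through the esophagus has been demonstrated in only a few feasibility studies and remains relatively unexplored. In this work, we present a computational framework called mechanics-informed MRI (MRI-MECH) that enhances this capability, thereby increasing the applicability of dynamic MRI for diagnosing esophageal disorders. Pineapple juice was used as the swallowed contrast material for the dynamic MRI, and the MRI image sequence was used as input to MRI-MECH. MRI-MECH modeled the esophagus as a flexible one-dimensional tube whose elastic walls followed a linear tube law. Flow through the esophagus was then governed by one-dimensional mass and momentum conservation equations. These equations were solved using a physics-informed neural network (PINN). The PINN minimized the difference between the MRI measurements and the model predictions while ensuring that the physics of the fluid flow problem was always followed. MRI-MECH calculated the fluid velocity and pressure during esophageal transit and estimated the mechanical health of the esophagus by calculating wall stiffness and active relaxation. Additionally, MRI-MECH predicted missing information about the lower esophageal sphincter during the emptying process, demonstrating its applicability to scenarios with missing data or poor image resolution. Besides improving clinical decisions through quantitative estimates of the mechanical health of the esophagus, MRI-MECH can also be extended to other medical imaging modalities to enhance their functionality.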
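
For orientation, a generic textbook form of one-dimensional flow in an elastic tube with a linear tube law is written out below. It is a sketch under stated assumptions and not necessarily the exact system used in MRI-MECH: the symbols, the placeholder viscous term f_visc, and the omission of any activation or relaxation factor in the tube law are choices made for this illustration.

```latex
% Assumed symbols: A(x,t) cross-sectional area, U(x,t) mean fluid velocity,
% P(x,t) pressure, \rho fluid density, K wall stiffness,
% A_0 reference area, P_0 reference pressure,
% f_visc a placeholder for viscous resistance.
\begin{align}
  \frac{\partial A}{\partial t} + \frac{\partial (A U)}{\partial x} &= 0
    && \text{(mass conservation)} \\
  \frac{\partial U}{\partial t} + U \frac{\partial U}{\partial x}
    + \frac{1}{\rho} \frac{\partial P}{\partial x} &= f_{\text{visc}}
    && \text{(momentum conservation)} \\
  P &= P_0 + K \left( \frac{A}{A_0} - 1 \right)
    && \text{(linear tube law)}
\end{align}
```

In a PINN setting, residuals of equations like these are typically added to a data-mismatch term, so that the network fits the measured area A while keeping U and P consistent with the physics.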


Langevin Autoencoders for Learning Deep Latent Variable Models

Shohei Taniguchi, Yusuke Iwasawa, Wataru Kumagai, Yutaka Matsuo

Category: Machine Learning | Machine Learning (Statistics)

2022-09-15

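Markov chain Monte Carlo (MCMC) methods, such as Langevin dynamics, are valid for approximating intractable distributions. However, their usage is limited in the context of deep latent variable models owing to costly datapoint-wise sampling iterations and slow convergence. This paper proposes amortized Langevin dynamics (ALD), in which datapoint-wise MCMC iterations are entirely replaced with updates of an encoder that maps observations to latent variables. This amortization enables efficient posterior sampling without datapoint-wise iterations. Despite its efficiency, we prove that ALD is valid as an MCMC algorithm whose Markov chain has the target posterior as its stationary distribution under mild assumptions. Based on ALD, we also present a new deep latent variable model named the Langevin autoencoder (LAE). Interestingly, the LAE can be implemented by slightly modifying a traditional autoencoder. Using multiple synthetic datasets, we first validate that ALD can properly obtain samples from target posteriors. We also evaluate the LAE on an image generation task and show that it can outperform existing methods based on variational inference, such as the variational autoencoder, as well as other MCMC-based methods, in terms of test likelihood.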
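
As background for the datapoint-wise iterations that ALD replaces, here is a minimal sketch of plain (unadjusted) Langevin dynamics on a one-dimensional target. It is not the paper's ALD or LAE: the Gaussian target, step size, chain length, and function names are assumptions chosen only to show the update rule z <- z + step * grad log p(z) + sqrt(2 * step) * noise.

```python
import math
import random


def langevin_samples(grad_log_p, z0, step, n_steps, rng):
    """Run unadjusted Langevin dynamics and return the visited states."""
    z = z0
    chain = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Gradient drift toward high density plus injected Gaussian noise.
        z = z + step * grad_log_p(z) + math.sqrt(2.0 * step) * noise
        chain.append(z)
    return chain


# Hypothetical target: a 1-D Gaussian "posterior" with mean 2.0 and std 0.5.
TARGET_MEAN, TARGET_STD = 2.0, 0.5


def grad_log_gaussian(z):
    # d/dz log N(z; mean, std^2) = -(z - mean) / std^2
    return -(z - TARGET_MEAN) / TARGET_STD ** 2


rng = random.Random(0)
chain = langevin_samples(grad_log_gaussian, z0=0.0, step=0.01, n_steps=20000, rng=rng)
samples = chain[2000:]  # discard burn-in
sample_mean = sum(samples) / len(samples)
sample_std = math.sqrt(sum((s - sample_mean) ** 2 for s in samples) / len(samples))
print(round(sample_mean, 2), round(sample_std, 2))
```

Running such a chain separately for every observation is what makes the approach expensive for deep latent variable models; the abstract's amortization replaces these per-datapoint chains with updates to a shared encoder.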


Improving Robustness to Out-of-Distribution Data by Frequency-based Augmentation

Koki Mukai, Soichiro Kumano, Toshihiko Yamasaki

Category: Computer Vision

2022-09-06

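Although convolutional neural networks (CNNs) achieve high accuracy in image recognition, they are vulnerable to adversarial examples and out-of-distribution data, and their differences from human recognition have been pointed out. To improve robustness against out-of-distribution data, we present a frequency-based data augmentation technique that replaces frequency components of an image with those of other images of the same class. When the training data are CIFAR10 and the out-of-distribution data are SVHN, the area under the receiver operating characteristic curve (AUROC) of the model trained with the proposed method increases from 89.22% to 98.15%, and further to 98.59% when combined with another data augmentation method. Furthermore, we experimentally demonstrate that the model that is robust to out-of-distribution data uses many high-frequency components of the image.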
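
To make the reported metric concrete, the sketch below computes AUROC for an out-of-distribution detection setup. It assumes that each input receives a scalar confidence score and that higher scores indicate in-distribution data; the abstract does not state which scoring function the authors use, and the scores below are made-up toy values.

```python
def auroc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical confidence scores: in-distribution inputs are treated as positives.
in_dist = [0.9, 0.8, 0.7, 0.6]
out_dist = [0.65, 0.3, 0.2]
score = auroc(in_dist, out_dist)
print(score)  # 11 of the 12 pairs are ranked correctly
```

This pairwise form is quadratic in the number of samples, which is fine for illustration; rank-based implementations are preferable for full test sets.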


Prediction of Seismic Intensity Distributions Using Neural Networks

Koyu Mizutani, Haruki Mitarai, Kakeru Miyazaki, Ryugo Shimamura, Soichiro Kumano, Toshihiko Yamasaki

Category: Computer Vision

2022-08-16

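Ground motion prediction equations are commonly used to predict seismic intensity distributions. However, it is not easy to apply this approach to seismic distributions affected by underground plate structures, which are commonly known as abnormal seismic distributions. This study proposes a hybrid of regression and classification approaches using neural networks. The proposed model treats the distributions as two-dimensional data, like an image. Our method can accurately predict seismic intensity distributions, even abnormal ones.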


nLMVS-Net: Deep Non-Lambertian Multi-View Stereo

Kohei Yamashita, Yuto Enyo, Shohei Nobuhara, Ko Nishino

Category: Computer Vision

2022-07-25

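We introduce a novel multi-view stereo (MVS) method that can simultaneously recover not just per-pixel depth but also surface normals, together with the reflectance of textureless, complex non-Lambertian surfaces captured under known but natural illumination. Our key idea is to formulate MVS as an end-to-end learnable network, which we refer to as nLMVS-Net, that seamlessly integrates radiometric cues to leverage surface normals as view-independent surface features for learned cost volume construction and filtering. It first estimates surface normals as pixel-wise probability densities for each view with a novel shape-from-shading network. These per-pixel surface normal densities and the input multi-view images are then fed to a novel cost volume filtering network that learns to recover per-pixel depth and surface normals. The reflectance is estimated by alternating with geometry reconstruction. Extensive quantitative evaluations on newly established synthetic and real-world datasets show that nLMVS-Net can robustly and accurately recover the shape and reflectance of complex objects in natural settings.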
